AITopics | beta distribution

We introduce beta diffusion, a novel generative modeling method that integrates demasking and denoising to generate data within bounded ranges. Using scaled and shifted beta distributions, beta diffusion utilizes multiplicative transitions over time to create both forward and reverse diffusion processes, maintaining beta distributions in both the forward marginals and the reverse conditionals, given the data at any point in time. Unlike traditional diffusion-based generative models relying on additive Gaussian noise and reweighted evidence lower bounds (ELBOs), beta diffusion is multiplicative and optimized with KL-divergence upper bounds (KLUBs) derived from the convexity of the KL divergence. We demonstrate that the proposed KLUBs are more effective for optimizing beta diffusion compared to negative ELBOs, which can also be derived as the KLUBs of the same KL divergence with its two arguments swapped. The loss function of beta diffusion, expressed in terms of Bregman divergence, further supports the efficacy of KLUBs for optimization. Experimental results on both synthetic data and natural images demonstrate the unique capabilities of beta diffusion in generative modeling of range-bounded data and validate the effectiveness of KLUBs in optimizing diffusion models, thereby making them valuable additions to the family of diffusion-based generative models and the optimization techniques used to train them.

beta diffusion, klub, name change, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.76)

Add feedback

ORCA: Open-ended Response Correctness Assessment for Audio Question Answering

Sedláček, Šimon, Barahona, Sara, Yusuf, Bolaji, Herrera-Alarcón, Laura, Kesiraju, Santosh, Bolaños, Cecilia, Lozano-Diez, Alicia, Udupa, Sathvik, López, Fernando, Ferner, Allison, Duraiswami, Ramani, Černocký, Jan

arXiv.org Artificial IntelligenceDec-11-2025

Evaluating open-ended responses from large audio language models (LALMs) is challenging because human annotators often genuinely disagree on answer correctness due to multiple valid interpretations, partial correctness, and subjective judgment. Traditional metrics reporting only mean scores fail to capture this uncertainty. We present ORCA (Open-ended Response Correctness Assessment), a framework that models the variability in human judgments using Beta distributions to predict both expected correctness and uncertainty. Our three-stage annotation framework combines human judgment with structured feedback and iterative refinement to simultaneously curate training data and improve benchmark quality. We collected 11,721 annotations across 3,580 question-answer pairs from 15 LALMs on two audio QA benchmarks, achieving inter-annotator agreement of 0.82 (Krippendorff's alpha). ORCA achieves 0.91 Spearman correlation with mean human judgments, matching or outperforming LLM-judge baselines while providing uncertainty estimates and requiring significantly less compute. We release our models, code, and curated dataset.

computational linguistic, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2512.09066

Country:

Europe (1.00)
North America > United States (0.68)
Asia > Middle East > UAE (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Filters

Collaborating Authors

beta distribution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

A Probabilistic Model of Social Decision Making based on Reward Maximization

Fast Sampling via Discrete Non-Markov Diffusion Models with Predetermined Transition Time

Overleaf Example

5fe1b43c882d746c187456eb4c8cdf52-Paper-Conference.pdf

c2f32522a84d5e6357e6abac087f1b0b-Paper.pdf

c1b70d965ca504aa751ddb62ad69c63f-Supplemental.pdf

BetaR-CNN: LookingintoPedestrianDetectionfrom AnotherPerspective

e43739bba7cdb577e9e3e4e42447f5a5-Paper.pdf

Beta Diffusion

ORCA: Open-ended Response Correctness Assessment for Audio Question Answering